-- The MAILING DATE of this communication appears on the cover sheet with the correspondence address --
Period for Reply 



A SHORTENED STATUTORY PERIOD FOR REPLY IS SET TO EXPIRE 3 MONTH(S) OR THIRTY (30) DAYS, 
WHICHEVER IS LONGER, FROM THE MAILING DATE OF THIS COMMUNICATION. 

- Extensions of time may be available under the provisions of 37 CFR 1.136(a). In no event, however, may a reply be timely filed 
after SIX (6) MONTHS from the mailing date of this communication. 

- If NO period for reply is specified above, the maximum statutory period will apply and will expire SIX (6) MONTHS from the mailing date of this communication. 

- Failure to reply within the set or extended period for reply will, by statute, cause the application to become ABANDONED (35 U.S.C. § 133). 
Any reply received by the Office later than three months after the mailing date of this communication, even if timely filed, may reduce any 
earned patent term adjustment. See 37 CFR 1.704(b). 

Status 

1) ☒ Responsive to communication(s) filed on 10 December 2003.
2a) □ This action is FINAL. 2b) ☒ This action is non-final.

3) □ Since this application is in condition for allowance except for formal matters, prosecution as to the merits is
closed in accordance with the practice under Ex parte Quayle, 1935 C.D. 11, 453 O.G. 213.

Disposition of Claims 

4) ☒ Claim(s) 1-14 is/are pending in the application.

4a) Of the above claim(s) is/are withdrawn from consideration.

5) □ Claim(s) is/are allowed.

6) ☒ Claim(s) 1-14 is/are rejected.

7) □ Claim(s) is/are objected to.

8) □ Claim(s) are subject to restriction and/or election requirement.

Application Papers 

9) □ The specification is objected to by the Examiner.

10) ☒ The drawing(s) filed on is/are: a) □ accepted or b) □ objected to by the Examiner.

Applicant may not request that any objection to the drawing(s) be held in abeyance. See 37 CFR 1.85(a). 

Replacement drawing sheet(s) including the correction is required if the drawing(s) is objected to. See 37 CFR 1.121(d). 
11) □ The oath or declaration is objected to by the Examiner. Note the attached Office Action or form PTO-152.

Priority under 35 U.S.C. § 119 

12) ☒ Acknowledgment is made of a claim for foreign priority under 35 U.S.C. § 119(a)-(d) or (f).
a) ☒ All b) □ Some * c) □ None of:

1. ☒ Certified copies of the priority documents have been received.

2. □ Certified copies of the priority documents have been received in Application No. .

3. □ Copies of the certified copies of the priority documents have been received in this National Stage
application from the International Bureau (PCT Rule 17.2(a)).
* See the attached detailed Office action for a list of the certified copies not received. 
DETAILED ACTION

Claim Rejections - 35 USC § 101 

1. 35 U.S.C. 101 reads as follows:

Whoever invents or discovers any new and useful process, machine, manufacture, or composition of 
matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the 
conditions and requirements of this title. 

2. Claims 6-8 and 13 are rejected under 35 U.S.C. 101 because the claimed
invention is directed to non-statutory subject matter. The preamble of those claims
recites "a document extracting program allowing a computer to serve as"; this does
not clarify how the program causes the computer to execute a particular method.
The program on its own is an abstract idea and has no particular use, and similarly
the computer has no use if no software is embedded on the hard drive.

Claim Objections 

3. Claim 3 is objected to under 37 CFR 1.75(c), as being of improper dependent
form for failing to further limit the subject matter of a previous claim. Applicant is
required to cancel the claim(s), or amend the claim(s) to place the claim(s) in proper
dependent form, or rewrite the claim(s) in independent form. Claim 3 presents a
limitation that has already been introduced in claim 2, from which claim 3 depends.

4. Claims 4, 8 and 11 are objected to because of the following informalities:
The acronym "TFIDF" needs to be spelled out so there is no doubt about the
meaning of this acronym, for instance, TFIDF (Term Frequency Inverse Document
Frequency). Appropriate correction is required.



Claim Rejections - 35 USC § 112

5. The following is a quotation of the second paragraph of 35 U.S.C. 112: 

The specification shall conclude with one or more claims particularly pointing out and distinctly 
claiming the subject matter which the applicant regards as his invention. 

6. Claims 1, 6, 9, 12 and 13 are rejected under 35 U.S.C. 112, second paragraph, 
as being indefinite for failing to particularly point out and distinctly claim the subject 
matter which applicant regards as the invention. 

7. In the last line of the respective claims, the applicant states, "any number of
documents is extracted from among a group of the documents". The examiner is not
certain which group of documents the applicant is referring to: is it the collection of
documents originally acquired, or is it a group of documents already selected based on
the similarity outcome? The claim language should be very clear, and must not raise any
doubts about the meaning of particular phrases.

8. Furthermore, the respective claims recite "a similarity
computing device to acquire a plurality of documents", but do not specify the
criteria on which those documents are selected. The applicant needs to clearly define
which specific documents are selected and why.



Claim Rejections - 35 USC § 102 
9. The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that
form the basis for the rejections under this section made in this Office action: 

A person shall be entitled to a patent unless - 

(b) the invention was patented or described in a printed publication in this or a foreign country or in public 
use or on sale in this country, more than one year prior to the date of application for patent in the United 
States. 

10. Claims 1, 6, 9 and 12-14 are rejected under 35 U.S.C. 102(b) as being
anticipated by Seki et al (US Publication 2002/0143737). Seki discloses a document
extracting device and a method, comprising: a similarity computing device to acquire
a plurality of documents to be candidates for extraction (Figure 9, elements 903 and 904)
and computing all degrees of similarity between the documents (paragraph 47); and a
document extracting device to extract a combination of documents whose sum of the
degrees of similarity between the documents computed by the similarity computing
device is the smallest when any number of documents are extracted from among a
group of the documents (paragraph 49: if the documents are similar (duplicates),
only one document will be output to the user, but if the documents have small
similarity then both of them will be output).
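
For illustration only, the extraction limitation quoted above (a combination of documents whose sum of pairwise degrees of similarity is the smallest) can be sketched as a brute-force search. This is a minimal sketch under stated assumptions and is not taken from Seki or from the application: the function name least_similar_subset, the precomputed symmetric similarity matrix, and the sample values are all hypothetical.

```python
from itertools import combinations


def least_similar_subset(similarity, k):
    """Return (indices, total) for the k documents whose pairwise
    similarities sum to the smallest value.

    `similarity` is assumed to be a symmetric matrix (list of lists)
    holding the degree of similarity between every pair of documents.
    """
    n = len(similarity)
    if not 0 < k <= n:
        raise ValueError("k must be between 1 and the number of documents")

    best_subset, best_total = None, float("inf")
    # Exhaustively try every combination of k documents.
    for subset in combinations(range(n), k):
        # Sum the similarity over every pair inside the candidate subset.
        total = sum(similarity[i][j] for i, j in combinations(subset, 2))
        if total < best_total:
            best_subset, best_total = subset, total
    return best_subset, best_total


# Hypothetical similarity values: documents 0 and 1 are near-duplicates.
similarity = [
    [1.0, 0.9, 0.1],
    [0.9, 1.0, 0.2],
    [0.1, 0.2, 1.0],
]
subset, total = least_similar_subset(similarity, 2)
print(subset, total)
```

With these sample values the near-duplicate pair (documents 0 and 1) is never selected together, since any pair that includes document 2 has a smaller similarity sum. The exhaustive search is exponential in the number of documents and is shown only to make the wording of the limitation concrete.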

Claim Rejections - 35 USC § 103 

11. The following is a quotation of 35 U.S.C. 103(a) which forms the basis for all
obviousness rejections set forth in this Office action: 

(a) A patent may not be obtained though the invention is not identically disclosed or described as set 
forth in section 102 of this title, if the differences between the subject matter sought to be patented and 
the prior art are such that the subject matter as a whole would have been obvious at the time the 
invention was made to a person having ordinary skill in the art to which said subject matter pertains. 
Patentability shall not be negatived by the manner in which the invention was made. 
12. Claims 2-5, 7, 8 and 10-11 are rejected under 35 U.S.C. 103(a) as being
unpatentable over Seki et al (US Publication 2002/0143737) in view of Wyard et al
(US Patent 6167398).

As to claims 2, 3, 7 and 10, Seki teaches the similarity computing device and the
method comprising a mutual similarity computing functional unit to compute the
similarity between the documents; however, Seki does not teach that the search is based
on a document vector involving character-string frequency computing. Wyard
teaches an information retrieval system and method that generates weighted
comparison results to analyze the degree of dissimilarity between documents, wherein
the word frequency corresponds to the character-string frequency. It would have been
obvious to one of ordinary skill in the art at the time the invention was made to
use the inverse document frequency technique (i.e. document vector) to measure the
similarity as taught by Wyard (column 2, lines 5-26) in Seki's similarity determining
means because the inverse document frequency method/algorithm is well known in the art,
relies on term weighting that improves performance, and is also easy to compute and
therefore faster than a Boolean search or comparison.

As to claim 4, Wyard further teaches the character-string frequency computing
functional unit generating document vectors obtained by weighting each of the
documents by TFIDF on the basis of the frequency of appearance of the divided
character strings (column 8, lines 22-29 and lines 47-55).
As to claim 5, Wyard further teaches the mutual similarity computing functional
unit computing the degrees of similarity between the documents by a vector space 
method on the basis of the document vectors of the documents (column 8, lines 22-29). 

As to claim 8, Wyard further teaches the similarity computing device comprising: 
a character-string-dividing function to divide each of the documents into character 
strings using any one of character string division methods (column 3, lines 36-41); a 
character-string frequency computing function to generate document vectors obtained 
by weighting each of the documents by TFIDF (column 8, lines 47-55) on the basis of 
the frequency of appearance of the divided character strings; and a mutual similarity 
computing function to compute the degrees of similarity between the documents by a 
vector space method on the basis of the document vectors of the documents (column 8, 
lines 22-29). 
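
For illustration only, the three functions recited above (character-string division, TFIDF weighting, and similarity by a vector space method) can be sketched as follows. This is a minimal sketch under stated assumptions and is not taken from Wyard, Seki, or the application: it assumes character n-gram division, raw term frequency multiplied by log(N/df) as one common TFIDF variant, and cosine similarity as the vector space measure. All function names and sample strings are hypothetical.

```python
import math
from collections import Counter


def char_ngrams(text, n=2):
    """Divide a document into overlapping character strings of length n."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]


def tfidf_vectors(documents, n=2):
    """Build one sparse TFIDF-weighted document vector (a dict) per document."""
    term_counts = [Counter(char_ngrams(doc, n)) for doc in documents]

    # Document frequency: number of documents containing each character string.
    doc_freq = Counter()
    for counts in term_counts:
        doc_freq.update(counts.keys())

    total = len(documents)
    return [
        {term: tf * math.log(total / doc_freq[term]) for term, tf in counts.items()}
        for counts in term_counts
    ]


def cosine_similarity(u, v):
    """Degree of similarity between two sparse document vectors."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # an all-zero vector has no defined direction
    return dot / (norm_u * norm_v)


# Hypothetical documents: the first two share character strings, the third does not.
documents = ["abcd", "abce", "wxyz"]
vectors = tfidf_vectors(documents)
print(cosine_similarity(vectors[0], vectors[1]))
print(cosine_similarity(vectors[0], vectors[2]))
```

Under this weighting, a character string that appears in every document receives a weight of zero, so it does not contribute to the degree of similarity. Other TFIDF variants (for example, smoothed or normalized forms) would give different numeric values.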

As to claim 11, Seki does not teach that the search is based on a document
vector involving character-string frequency computing and, in particular, TFIDF. Wyard
teaches an information retrieval system and method that generates weighted
comparison results to analyze the degree of dissimilarity between documents, wherein
the word frequency corresponds to the character-string frequency. Wyard also teaches
using an n-gram string division method (column 3, lines 36-41) and weighting each of the
documents by TFIDF on the basis of the frequency of occurrence of a particular character
string. It would have been obvious to one of ordinary skill in the art at the time
the invention was made to use the inverse document frequency technique (i.e. document
vector) to measure the similarity as taught by Wyard (column 2, lines 5-26) in Seki's
similarity determining means because the inverse document frequency method/algorithm is
well known in the art, relies on term weighting that improves performance,
and is also easy to compute and therefore faster than a Boolean search
or comparison.

The Prior Art 

13. The prior art made of record and not relied upon is considered pertinent to 
applicant's disclosure. 

- US Patent 6615209 discloses a method for comparing two documents for
similarity.

Inquiry 

14. Any inquiry concerning this communication or earlier communications from the
examiner should be directed to Angela M. Lie whose telephone number is 571-272-8445.
The examiner can normally be reached on M-F.

If attempts to reach the examiner by telephone are unsuccessful, the examiner's
supervisor, Don Wong, can be reached on 571-272-1834. The fax phone number for the
organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the
Patent Application Information Retrieval (PAIR) system. Status information for 
published applications may be obtained from either Private PAIR or Public PAIR. 
Status information for unpublished applications is available through Private PAIR only. 
For more information about the PAIR system, see http://pair-direct.uspto.gov. Should 
you have questions on access to the Private PAIR system, contact the Electronic 
Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a 
USPTO Customer Service Representative or access to the automated information 
system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000. 





Angela M Lie 



DON WONG
SUPERVISORY PATENT EXAMINER



